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Abstract — In  this  paper,  a  fully  Bayesian  algorithm  for  endmember 
extraction  and  abundance  estimation  for  hyperspectral  imagery  is  in¬ 
troduced.  Following  the  linear  mixing  model,  each  pixel  spectrum  of 
the  hyperspectral  image  is  decomposed  as  a  linear  combination  of  pure 
endmember  spectra.  The  estimation  of  the  unknown  endmember  spectra 
and  the  corresponding  abundances  is  conducted  in  a  unified  manner  by 
generating  the  posterior  distribution  of  the  unknown  parameters  under 
a  hierarchical  Bayesian  model.  The  proposed  model  accounts  for  non¬ 
negativity  and  full-additivity  constraints,  and  exploits  the  fact  that  the 
endmember  spectra  lie  on  a  lower  dimensional  space.  A  Gibbs  algorithm 
is  proposed  to  generate  samples  distributed  according  to  the  posterior  of 
interest.  Simulation  results  illustrate  the  accuracy  of  the  proposed  joint 
Bayesian  estimator. 

I.  Introduction 

Over  the  last  years,  the  spectral  unmixing  problem  has  been  con¬ 
sidered  by  many  researchers.  Spectral  unmixing  consists  of  decom¬ 
posing  an  observed  pixel  spectmm  into  a  collection  of  pure  spectra, 
usually  referred  to  as  endmembers,  and  estimating  the  proportions  or 
abundances  of  each  material  in  the  image  pixels  [1].  To  describe  the 
mixture,  the  most  frequently  encountered  model  is  the  linear  mixing 
model  (EMM)  which  gives  a  good  approximation  in  the  reflective 
spectral  domain  ranging  from  0.4/im  to  2.5/im.  It  assumes  that  the 
observed  pixel  spectrum  is  a  weighted  linear  combination  of  the 
endmember  spectra. 

Spectral  unmixing  has  often  been  handled  as  a  two-step  procedure: 

i)  the  endmember  extraction  step  dedicated  to  the  identification  of 
the  macroscopic  materials  that  are  present  in  the  observed  scene  and 

ii)  the  inversion  step  which  consists  of  estimating  the  proportions 
of  the  materials  previously  identified.  This  paper  proposes  an  al¬ 
gorithm  that  estimates  the  endmember  spectra  and  their  respective 
abundances  jointly.  This  approach  casts  spectral  unmixing  as  a  blind 
source  separation  (BSS)  problem.  The  Bayesian  model  studied  in 
this  paper  uses  a  Gibbs  sampling  algorithm  to  efficiently  solve  the 
constrained  spectral  unmixing  problem  without  requiring  the  presence 
of  pure  pixels  in  the  hyperspectral  image.  In  many  works,  Bayesian 
estimation  approaches  have  been  adopted  to  solve  BSS  problems  like 
spectral  unmixing.  The  Bayesian  formulation  allows  one  to  directly 
incorporate  constraints  into  the  model.  These  constraints  include 
sparsity  [2];  non-negativity  [3];  full  additivity  (sum-to-one  constraint) 
[4].  In  this  paper,  prior  distributions  are  proposed  for  the  abundances 
and  endmember  spectra  to  enforce  the  constraints  inherent  to  the 
hyperspectral  mixing  model.  These  constraints  include  non-negativity 
and  full-additivity  of  the  abundance  coefficients  (as  in  [4])  and  non¬ 
negativity  of  the  endmember  spectra. 

Moreover,  the  proposed  joint  spectral  unmixing  approach  is  able 
to  solve  the  endmember  spectrum  estimation  problem  directly  on  a 
lower  dimensional  space  within  a  Bayesian  framework.  We  believe 
that  this  is  one  of  the  principal  factors  leading  to  performance 
improvements  that  we  show  in  Section  V.  The  problem  of  hyperpa¬ 
rameter  selection  in  our  Bayesian  model  is  circumvented  by  adopting 


the  hierarchical  Bayesian  approach  of  [4]  that  produces  a  parameter- 
independent  Bayesian  posterior  distribution  for  the  endmember  spec¬ 
tra  and  abundances.  To  overcome  the  complexity  of  the  full  posterior 
distribution,  a  Gibbs  sampling  strategy  is  derived  to  approximate 
standard  Bayesian  estimators,  e.g.,  the  minimum  mean  squared  error 
(MMSE)  estimator.  Moreover,  as  the  full  posterior  distribution  of 
all  the  unknown  parameters  is  available,  confidence  intervals  can  be 
easily  computed.  These  measures  allow  one  to  quantify  the  accuracy 
of  the  different  estimates. 

II.  Linear  mixing  model  and  problem  statement 

Consider  P  pixels  of  an  hyperspectral  image  acquired  in  L  spectral 
bands.  According  to  the  linear  mixing  model  (EMM),  described  for 
instance  in  [1],  the  L-spectrum  . . . ,  of  the  pth 

pixel  (p  =  1, . . . ,  P)  is  assumed  to  be  a  linear  combination  of  R 
spectra  rtir  corrupted  by  an  additive  Gaussian  noise 
R 

Vp  =  '^mrap,r  +  rip  (1) 

r=l 

where  . . . ,  denotes  the  spectrum  of  the  rth 

material,  ap^r  is  the  fraction  of  the  rth  material  in  the  pth  observation, 
R  is  the  number  of  materials,  L  is  the  number  of  available  spectral 
bands  and  P  is  the  number  of  observations  (pixels).  Moreover,  in 
(1),  rip  =  [np_i, . . .  ,np,_L]^  is  an  additive  noise  sequence  which  is 
assumed  to  be  an  independent  and  identically  distributed  (i.i.d.)  zero- 
mean  Gaussian  sequence  with  covariance  matrix  Sn  =  where 

Ii  is  the  identity  matrix  of  dimension  L  x  L,  i.e., 

np  ~  Af  (Oi,  Sn) .  (2) 

Finally,  note  that  the  model  in  (1)  can  be  easily  modified  (see  [5] 
and  [4]). 

Due  to  physical  considerations  [1],  the  fraction  vectors  Op  = 
[app, . . . ,  ap.i?]^  in  (1)  satisfy  the  following  non-negativity  and  full- 
additivity  (or  sum-to-one)  constraints 

fflp.r  >0,  Vr  =  1, . . . ,  P, 

—  1 

2^r=l  “PT  ~ 

In  other  words,  the  p  abundance  vectors  belong  to  the  space 

A  =  :  ||a|jj^  =  1  and  a  ^  O}  (4) 

where  is  the  norm  defined  as  ||tc||j^  =  i  \Xil  and  a  y  0 
stands  for  the  set  of  inequalities  {or  >  Moreover,  the 

endmember  spectra  component  mr,i  must  satisfy  the  following  non¬ 
negativity  constraints 

mr-,1  >  0,  Vr  =  1, . . . ,  P,  V(  =  1, . . . ,  L.  (5) 

Considering  all  pixels,  standard  matrix  notation  yields  Y  = 
MA  +  N  where  Y  =  [j/j, . . . ,  t/p],  M  =  [mi, . . . ,  ms]. 
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A  =  [a\, . . .  ,ap\  and  N  =  [ni, . . . ,  np\.  In  this  work,  we  propose 
to  estimate  A  and  M  from  the  noisy  observations  Y  under  the 
constraints  in  (3)  and  (5). 


A.  Likelihood 


III.  Bayesian  model 


The  linear  mixing  model  defined  in  (I)  and  the  statistical  properties 
in  (2)  of  the  noise  vector  Up  result  in  a  conditionally  Gaussian 
distribution  for  the  observation  of  the  pth  pixel:  j/p|  AT ,  Op,  cr^  ~ 
M  (^Map,G^\L).  Assuming  independence  between  the  noise  se¬ 
quences  Up  (p  —  1, . . . ,  P),  the  likelihood  function  of  all  the 
observations  Y  is 


B.  Prior  model  for  the  endmember  spectra 

1}  Dimensionality  reduction:  It  is  interesting  to  note  that  the 
unobserved  matrix  X  =  MA  —  Y  —  N  is  rank  deficient  under 
the  linear  model  (1).  Consequently,  in  the  noise-free  case,  X  can 
be  represented  in  a  suitable  lower-dimensional  subset  Vk  of 
(R  —  1  <  K  <  L)  without  loss  of  information.  As  noted  in  [1], 
dimensionality  reduction  is  a  common  step  of  the  spectral  unmixing, 
adopted  by  numerous  endmember  extraction  algorithms  (EEAs),  such 
as  N-FINDR  [6]  or  PPI  [7].  Similarly,  we  propose  to  estimate  the 
projection  fr  (r  =  1, . . . ,  i?)  of  the  endmember  spectra  trir  in  the 
subspace  Vk-  The  identification  of  this  subspace  can  be  achieved 
via  a  standard  dimension  reduction  procedure.  In  the  sequel,  we 
propose  to  define  Vk  as  the  subspace  spanned  by  K  orthogonal  axes 
vi, . . .  ,vk  identified  by  a  principal  component  analysis  (PC A)  on 
the  observations  Y 


Vx  =  span(ui, . . . ,  uk)  •  (7) 

2 )  PCA  projection:  If  D  and  V  denote  the  diagonal  matrix  of  the 
K  highest  eigenvalues  of  the  empirical  covariance  matrix  and  the 
corresponding  eigenvector  matrix,  respectively,  the  PCA  projection 
tr  £  R^  of  the  endmember  spectrum  m-r  £  R^  is 

tr  =  P  {rrir  -  y)  (8) 

with  P  —  D~2V.  Equivalently, 

rrir  =  Utr  +  y  (9) 

rri  1 

with  U  =  V  D2.  Note  that  in  the  subspace  Vr-i  obtained  for 
K  =  R  —  1,  the  vectors  p  form  a  simplex  that  standard 

EEAs  try  to  recover.  In  this  paper,  we  estimate  the  vertices  tr 
(r  =  of  this  simplex  using  a  Bayesian  approach.  The 

Bayesian  prior  distributions  for  the  projections  tr  (r  =  1, . . . ,  i?) 
are  introduced  in  the  following  paragraph. 

3)  Prior  distribution  for  the  projected  spectra:  To  ensure  non¬ 

negativity  constraints  (5)  of  the  corresponding  reconstructed  L  x  1 
spectra  rrir,  a  conjugate  multivariate  Gaussian  distribution  (MGD) 
-AfPr  (  c-r ,  truncated  on  the  set '  1  r  is  chosen  as  prior  distribution 

for  tr,  assumed  to  be  a  priori  independent.  The  set  Tr  C  Vk  is 
explicitly  defined  in  [8]  and  has  the  following  property 

{mi^r  >  0,  Vi  =  1,  .  .  .  ,  I/}  {tr  £  %}  ■  (10) 


identified  by  an  EEA,  e.g.,  N-FINDR.  The  variances  (r  = 
1, ...  ,R)  reflect  the  degree  of  confidence  given  to  this  prior  infor¬ 
mation.  When  no  additional  knowledge  is  available,  these  variances 
are  fixed  to  large  values. 

C.  Abundance  prior 

For  each  observed  pixel  p,  with  the  full  additivity  constraint  in  (3), 
the  abundance  vectors  (p  =  1, . . . ,  P)  can  be  rewritten  as 


and  Opp  =  1  —  Up.r.  Following  the  model  in  [4],  the  priors 

chosen  for  Cp  (p  —  1, ...  ,P)  are  uniform  distributions  on  the  simplex 
S  defined  by 

S  =  |cp;  ||cp||j  <  1  and  Cp  ^  o}  .  (11) 

Under  the  assumption  of  statistical  independence  between  the  abun¬ 
dance  vectors  Cp  (p  —  1, . . . ,  P),  the  full  prior  distribution  for  partial 
abundance  matrix  C  =  [ci, . . . ,  cp]^  can  be  written 

p 

/  (C)  cx  Is  (cp) .  (12) 

p=i 

As  noted  in  [4],  the  uniform  prior  distribution  reflects  a  lack  of  a 
priori  knowledge  about  the  abundance  vector.  However,  as  demon¬ 
strated  in  [8],  among  two  a  priori  equiprobable  solutions  of  the 
BSS  problem,  the  uniform  prior  allows  one  to  favor  a  posteriori 
the  solution  corresponding  to  the  polytope  in  the  projection  subset 
Vk  having  smallest  volume. 

D.  Noise  variance  prior 

A  conjugate  inverse-gamma  distribution  is  chosen  as  prior  for 


where  the  hyperparameter  o  will  be  fixed  to  =  2  and  7  will  be 
a  random  and  adjustable  hyperparameter,  whose  prior  distribution  is 
defined  below. 

E.  Prior  distribution  for  hyperparameter  7 

The  prior  for  7  is  a  non-informative  Jeffreys’  prior  which  reflects 
the  lack  of  knowledge  regarding  this  hyperparameter 

/(7)  oc  ili{+ (7).  (14) 

7 

F.  Posterior  distribution 

The  posterior  distribution  of  the  unknown  parameter  vector  6  = 
I C,  T,  (T^  }  can  be  computed  from  marginalization  using  the  follow¬ 
ing  hierarchical  structure 

f{e\Y)^  J  f{e,'r\Y)dycK  j  f{Y\e)f{e\y)f{^)dy  (15) 

where  /  and  /  (7)  are  defined  in  (6)  and  (14)  respectively. 

Moreover,  under  the  assumption  of  a  priori  independence  between 
C,  T  and  ,  the  following  result  can  be  obtained 

/(0|7)=/(C)/(T|e,s")/(a"|i.,7) 


This  paper  proposes  to  select  the  a  priori  mean  vectors  Sr  (r  = 
l,...,i?)  as  the  projected  spectra  of  pure  components  previously 


(16) 


where  /  (C),  f  (T  \  e,  s^)  and  /  |  7)  have  been  previously 

defined.  This  hierarchical  structure  allows  one  to  integrate  out  the 
hyperparameter  7  from  the  joint  distribution  /  {G,^\Y),  yielding 


TABLE  I 

Abundance  means  and  variances  of  each  endmember  in  each 
REGION  OF  THE  100  X  100  HYPERSPECTRAL  IMAGE. 


/  (C,T, a" |y)  cxP]l5  (c 


R 

X  JJ^exp 

r=l 


a  I Y  )  cx  ^ 


2s? 


Endm. 

Region 

Region  #2 

Region  #3  | 

mean 

var. 

mean 

var. 

mean 

var. 

#1 

0.60 

0.01 

0.25 

0.01 

0.25 

0.02 

#2 

0.20 

0.02 

0.50 

0.01 

0.15 

0.005 

#3 

0.20 

0.01 

0.25 

0.02 

0.60 

0.02 

(tr) 


P 

n 

p^i 

/  1  \  2  (  Vp~  (UT  +  ylJi)  ap  \ 

[[a-)  2a^  jj 

with 

f 

Ar  = 

y  alrU^-Ef'^U  +  \ik 

(17) 

•  ^  Sr 

.P=l 

where  1_r  =  [1, 


Deriving  the  Bayesian  estimators 


(e.g.,  MMSE  or  MAP)  from  the  posterior  distribution  in  (17)  remains 
intractable.  In  such  case,  it  is  very  common  to  use  Markov  chain 
Monte  Carlo  (MCMC)  methods  to  generate  samples  asymptotically 
distributed  according  to  the  posterior  distribution.  The  Bayesian 
estimators  can  then  be  approximated  using  these  samples.  The  next 
section  studies  a  Gibbs  sampling  strategy  allowing  one  to  generate 
samples  distributed  according  to  (17). 


T  f  —  .A./. 


^  ^  0^p,rU  Sjj  2 


Lp=i 


and 


—  Up  ^p,ry  ^  ^  ^P,j'^^j  • 


(22) 


(23) 


Note  that  rrij  =  Utj+y.  As  a  consequence,  the  posterior  distribution 
of  tr  is  the  following  truncated  MGD 


IV.  Gibbs  sampler 


tr  \  T.r,Cr,a‘^ ,Y  ~  A/r„  (t^ ,  A,.) . 


(24) 


Random  samples  (denoted  by  where  t  is  the  iteration  index) 
can  be  drawn  from  f  (^C,T,a^  \  Y)  using  a  Gibbs  sampler  [9].  This 
MCMC  technique  consists  of  generating  samples  } 

distributed  according  to  the  conditional  posterior  distributions  of  each 
parameter. 


A.  Sampling  from  f  (^C\T,  a^,Y^ 

Straightforward  computations  yield  for  each  observation 


f{cp\T,a\yp) 
oc  exp 

where 


(Cp  '^p')  (Cp  '^p') 


l5(Cp),  (18) 


Sp  — 


'Up  —  Sp 


(M.fl  -  S„  ^  (M.fl  - 

(M.fl  -  (t/p  -  tur) 


(19) 

with  and  where  M.r  denotes  the  matrix  M  whose 

i?th  column  has  been  removed.  As  a  consequence,  Cp|T,(j^,j/p  is 
distributed  according  to  an  MGD  truncated  on  the  simplex  S  in  (11) 


Cp  T,a  ,y  ~  A/fi  (ttp,  Sp)  . 


(20) 


Note  that  samples  can  be  drawn  from  an  MGD  truncated  on  a  simplex 
using  efficient  Monte  Carlo  simulation  strategies  described  in  [10]. 


B.  Sampling  from  f  (^T\C ,  ,Y'^ 

Define  T  ^  as  the  matrix  T  whose  rth  column  has  been  removed. 
Then  the  conditional  posterior  distribution  of  fr  (f  =  1, . . . ,  P)  is 


/  (tr\T.r,Cr,0^  ,Y)  (X 


exp 


--{tr-Tr)'^Ar^{tr-Tr)  lr„  (P)  ,  (21) 


C.  Sampling  from  f  (a‘^\C  ,T  ,Y^ 

The  conditional  distribution  of  ct^\C ,  T,  Y  is  the  following  inverse 
Gamma  distribution: 

a^\C,T,Y  |^:^,l^||y^-Map||"^  .  (25) 

V.  Simulations  on  synthetic  data 

To  illustrate  the  accuracy  of  the  proposed  algorithm,  simulations 
are  conducted  on  a  100  x  100  synthetic  image.  This  hyperspectral 
image  is  composed  of  three  different  regions  with  P  =  3  pure 
materials  representative  of  a  suburban  scene:  construction  concrete, 
green  grass  and  red  brick.  The  spectra  of  these  endmembers  have 
been  extracted  from  the  spectral  libraries  distributed  with  the  ENVI 
software  [11]  and  are  represented  in  Fig.  1  (top,  black  lines).  The 
reflectances  are  observed  in  L  =  413  spectral  bands  ranging  from 
0.4/im  to  2.5/im.  These  P  =  3  components  have  been  mixed  with 
proportions  that  have  been  randomly  generated  according  to  MGDs 
truncated  on  the  simplex  S  with  means  and  variances  reported  in 
Table  I.  The  generated  abundance  maps  have  been  depicted  in  Fig.  2 
(top)  in  gray  scale  where  a  white  (resp.  black)  pixel  stands  for  the 
presence  (resp.  absence)  of  the  material.  The  signal-to-noise  ratio  has 
been  tuned  to  SNRdB  =  15dB. 

The  resulting  hyperspectral  data  have  been  unmixed  by  the  pro¬ 
posed  algorithm.  First,  the  space  Vk  in  (7)  has  been  identified  by 
PGA  as  discussed  in  paragraph  III-B2.  The  hidden  mean  vectors 
(r  =  1, . . . ,  P)  of  the  normal  distributions  introduced  in  paragraph 
(III-B)  have  been  chosen  as  the  PGA  projections  of  endmembers 
previously  identified  by  N-FINDR.  The  hidden  variances  s?  have 
all  been  chosen  equal  to  sf  =  . . .  =  =  50  to  obtain  vague 

priors  (i.e.  large  variances).  The  Gibbs  sampler  has  been  run  with 
Amc  =  1300  iterations,  including  Nbi  =  300  burn-in  iterations.  The 
MMSE  estimates  of  the  abundance  vectors  ap  (p  —  1, P)  and 
the  projected  spectra  tr  (r  —  1,...,P)  have  been  approximated 
by  computing  empirical  averages  over  the  last  computed  outputs 
of  the  sampler.  The  corresponding  endmember  spectra  estimated 
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Fig.  1.  Actual  endmembers  (black  lines),  endmembers  estimated  by  N-FINDR  (blue  lines),  endmembers  estimated  by  VCA  (green  lines)  and  endmembers 
estimated  by  proposed  approach  (red  lines). 


by  the  proposed  algorithm  are  depicted  in  Fig.  1  (top,  red  lines). 
The  proposed  algorithm  clearly  outperforms  N-FINDR  and  VCA,  as 
shown  in  Fig.  1. 

Moreover,  the  MMSE  estimated  abundance  maps  are  depicted  in 
Fig.  2  (bottom)  and  are  clearly  in  good  agreement  with  the  simulated 
maps  (top).  Note  that  the  proposed  Bayesian  estimation  provides  the 
joint  posterior  distribution  of  the  unknown  parameters.  Specifically, 
this  posterior  distribution  allows  one  to  derive  confidence  intervals 
regarding  the  parameters  of  interest. 
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Fig.  2.  Top:  actual  endmember  abundance  maps.  Bottom:  estimated  end- 
member  abundance  maps. 


VI.  Conclusions 

This  paper  addressed  the  unsupervised  unmixing  problem  of  hy- 
perspectral  images,  i.e.  estimating  the  endmember  spectra  in  the 
observed  scene  and  their  respective  abundances  for  each  pixel.  A 
Bayesian  model  as  well  as  an  MCMC  algorithm  was  introduced, 
based  on  appropriate  priors  for  the  abundance  vectors  to  ensure  non¬ 
negativity  and  sum-to-one  constraints  inherent  to  the  linear  mixing 
model.  Instead  of  estimating  the  endmember  spectral  signatures  in 
the  observation  space,  we  proposed  to  estimate  their  projections 
onto  a  suitable  subspace.  In  this  subspace,  these  projections  were 
assigned  priors  that  satisfy  positivity  constraints  on  the  reconstructed 
endmember  spectra.  A  Gibbs  sampling  scheme  was  proposed  to 


generate  samples  asymptotically  distributed  according  to  this  pos¬ 
terior.  The  available  samples  were  then  used  to  approximate  the 
Bayesian  estimators  for  the  different  parameters  of  interest.  Results  of 
simulations  conducted  on  synthetic  hyperspectral  images  illustrated 
the  accuracy  of  the  proposed  Bayesian  method  when  compared  with 
other  algorithms  from  the  literature. 
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